Domain adaptation methods in the IBM trainable text-to-speech system

Authors

Volker Fischer

Jaime Botella Ordinas

Siegfried Kunzmann

Abstract

This paper presents a comparison of domain adaptation techniques for a unit selection based text-to-speech system. The methods under investigation consider two different prerequisites, namely the absence and the existence of additional domain specific training prompts, spoken by the original voice talent. Whereas in the first case we employ domain specific pre-selection, for the latter we compare a variety of methods that range from a simple extension of the segment inventory to a complete reconstruction of the system, which also includes the training of decision trees for the domain dependent prediction of prosody targets. An experimental evaluation of the methods under consideration unveils significant improvements (up to 1.1 on a 5 point MOS scale) over the baseline system for sentences from the target domain, while showing no significant degradation when synthesizing sentences from domains other than the adaptation domain.

Similar resources

Current status of the IBM Trainable Speech Synthesis System

This paper describes the current status of the IBM Trainable Speech Synthesis System. The system is a state-of-the-art, trainable, unit-selection based concatenative speech synthesiser. The system uses hidden Markov models (HMMs) to provide a phonetic transcription and HMM state alignment of a database of single-speaker continuous-speech training data. The runtime synthesiser uses the HMM state...

Pointwise Prediction and Sequence-Based Reranking for Adaptable Part-of-Speech Tagging

This paper proposes an accurate method for part-of-speech (POS) tagging that is highly domain-adaptable. The method is based on an assumption that the POS transition tendencies do not depend on domains, and has the following three characteristics: 1) it is trainable from partially annotated data, 2) it uses efficiently trainable pointwise POS taggers to allow for active learning, and 3) is more ...

Reducing the footprint of the IBM trainable speech synthesis system

This paper presents a novel approach for concatenative speech synthesis. This approach enables reduction of the dataset size of a concatenative text-to-speech system, namely the IBM trainable speech synthesis system, by more than an order of magnitude. A spectral acoustic feature based speech representation is used for computing a cost function during segment selection as well as for speech gen...

Phrase splicing and variable substitution using the IBM trainable speech synthesis system

This paper describes a phrase splicing and variable substitution system which offers an intermediate form of automated speech production lying in-between the extremes of recorded utterance playback and full Text-to-Speech synthesis. The system incorporates a trainable speech synthesiser and an application specific set of pre-recorded phrases. The text to be synthesised is converted to a phone se...

Unsupervised Vocabulary Adaptation for Morph-based Language Models

Modeling of foreign entity names is an important unsolved problem in morpheme-based modeling that is common in morphologically rich languages. In this paper we present an unsupervised vocabulary adaptation method for morph-based speech recognition. Foreign word candidates are detected automatically from in-domain text through the use of letter n-gram perplexity. Over-segmented foreign entity na...

Publication date: 2004
